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Abstract 

Small-scale deformation phenomena such as subgrain formation, devel- 
opment of texture, and grain boundary sliding require simulations with a 
high degree of spatial resolution. When we consider finite-element sim- 
ulation of metal deformation, this equates to thousands or hundreds of 
thousands of finite elements. Simulations of the dynamic deformations 
of metal samples require elastic-plastic constitutive updates of the ma- 
terial behavior to be performed over a small time step between updates, 
as dictated by the Courant condition. Further, numerical integration of 
physically-based equations is inherently sensitive to the step in time taken; 
they return different predictions as the time step is reduced, eventually 
approaching a stationary solution. Depending on the deformation con- 
ditions, this converged time step becomes short ( 10~ 9 s or less). If an 
implicit constitutive update is applied to this class of simulation, the ben- 
efit of the implicit update (i.e., the ability to evaluate over a relatively 
large time step) is negated, and the integration is prohibitively slow. The 
present work recasts an implicit update algorithm into an explicit form, 
for which each update step is five to six times faster, and the compute time 
required for a plastic update approaches that needed for a fully-elastic up- 
date. For dynamic loading conditions, the explicit model is found to per- 
form an entire simulation up to 50 times faster than the implicit model. 
The performance of the explicit model is enhanced by adding a subcy- 
cling algorithm to the explicit model, by which the maximum time step 
between constitutive updates is increased an order of magnitude. These 
model improvements do not significantly change the predictions of the 
model from the implicit form, and provide overall computation times sig- 
nificantly faster than the implicit form over finite-element meshes. These 
modifications are also applied to polycrystals via Taylor averaging, where 
we also see improved model performance. 
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1 Introduction 



The explosion in computing power over the last few decades has heightened 
the aspirations of numerical analysts. Simulations previously thought difficult 
are now commonplace; those once considered impossible or impractical are now 
merely time-consuming. Increased computing power has seen an additional 
focus on the behavior of polycrystalline metals, of both face-centered cubic and 
body-centered cubic crystal structure. 

The basis for modern crystal plasticity is modeling of single crystals. Av- 
eraging techniques for polycrystals rely upon a well-formulated representation 
of the single crystal. Alternately, direct numerical simulation of polycrystals 
models a particular multi-crystalline material by tracking the interactions 
between several single crystals. Both of these methods for capturing the behav- 
ior of polycrystals are computationally intensive; the goal of the modeler at the 
constitutive level is thus to provide the fastest model possible while retaining 
the underlying physics of the model (and, by extension, the model accuracy). 

The fundamental importance of single crystal modeling is reflected in the 
abundance of theories on monocrystalline plasticity in the literature. Such the- 
ories begin to appear in the literature in the early 20 th century and are presented 
by many investigators, including: Taylor EH : Schmid |57]; Bishop 0]; Hill 
PI : Rice |2E|; Hutchinson Q3]; Asaro 0, [5; Havner HO], HH; Bassani 0; and 
Kocks We seek neither to provide a comprehensive review of these theories, 
nor do we wish to provide a comparison among them. Rather, we will use the 
update procedure of Cuitiho and Ortiz jS], first in an implicit form as originally 
presented. In the interest of increasing the computational speed of the model, 
we will present an explicit form of this model. Then, in view of the relatively 
short time steps allowed by the explicit integration, we arrive at a convergence 
condition for the explicit integration. Using this condition, we are able to de- 
vise a subcycling scheme that allows the explicit integration to converge over 
larger user-specified time steps. We illustrate these model improvements with 
numerical examples, which include the computation time required for each. 

2 Constitutive Framework 

As stated earlier, we are applying the model of Cuitiho and Ortiz 5 for our 
work here. We briefly review the fundamentals of this model here in order to 
make the current work as self-contained as possible. 

Our current model follows the lead of numerous previous authors, including 
but not limited to Lee |17|. |16j : Kratochvil ^B]; Green and Naghdi [5]; Mandel 
H3; Nemat-Nasser E2; Onat g3j; Loret QU; and Dafilias The underlying 
assumption common to all these theories is that the overall deformation gradient 
F can be decomposed into an elastic component F e and a plastic part F p , as: 

F = F e F p (1) 

The existence of such a multiplicative decomposition implies that there is 
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some stress-free intermediate configuration which contains the deformation due 
to plastic slip only; lattice distortion and rotation are presumed to be contained 
in F e . The plastic deformation gradient is assumed to be volume-conserving. 
These assumptions ensure that the decomposition Q is unique. The deforma- 
tion power per unit undeformed volume can thus be written 

P : F = P : F C + £ : V (2) 

- pT 

where P = PF is a first Piola-Kirchhoff stress tensor relative to the in- 
termediate configuration, and X = F e PF pT is a stress measure conjugate to 
the plastic velocity gradients on the intermediate configuration, given by L p = 
F p F pT . The work conjugacy relations J2J imply forms for the plastic flow rule 
and elastic stress-strain relations, i.e., 

L p = L p (S, Q) 

P = P(F e ,Q) (3) 

where Q represents the appropriate internal variables defined on the inter- 
mediate configuration, subject to appropriate evolution equations (hardening 
laws). The most general form of the second of © that is material- frame indif- 
ferent is 

P = F e S (C e ) (4) 
where C e = F eT F e is the right Cauchy-Green deformation tensor on the in- 

— — e— 1 — 

termediate configuration, and S = C X is a symmetric second Piola-Kirchhoff 
stress tensor on the intermediate configuration. For metals, we can assume a 
linear relation between S and the elastic Lagrangian strain on the intermediate 
configuration, E e = (C e — I) /2 without loss of generality. Higher-order moduli 
are available, for example, see Teodosiu |3l)j. 

Rice nas shown that the formulation of L p used here has the structure 

L p = ^7 Q s a ®m a (5) 

a 

where j a is the shear rate on slip system a, which has slip direction s" and 
normal vector m Q . We follow the usual assumption that these slip rates depend 
on stress through the corresponding resolved shear stress r Q only, meaning 

r=r(r a ,Q) (6) 

Peirce, et al |25| and several others have proposed a power law representation 
for the slip rates, 

^= J >0 (7 ) 

I 0, otherwise 

where m is a strain-rate sensitivity exponent, 70 is a reference strain rate, and 
g a is the current flow stress on slip system a. As noted in the literature \'21\ . 
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|18|. this formulation returns unrealistic slip strain rates for values of r a /g a 
much different than unity. Hence, we use the form of this law presented by 
Cuitiho and Ortiz 



7o 



9 r ' 



(8) 



where it is assumed that 7 Q = if r Q < g a . Note that we have assumed that 
slip in any system must be positive. That is, the combination of direction s Q 
and normal vector m" is taken to be a different system than the combination of 
direction —s a and normal vector m Q . Hence, slip will only occur for r Q > g a . 
This modification to the power law representation removes the singularity often 
seen at t q = g a , naturally introducing the result of zero slip velocity when the 
resolved shear stress and flow stress are equal. The hardening relation governing 
the value of g a is given as 

g a = J2 ha ^ ( 9 ) 

P 

where h a ^ are the hardening moduli. These hardening moduli are provided 
from a statistical analysis based on the analysis of Ortiz and Popov [21], with 
the result 



K (t) 



2[r° (f)]- 



cosh 



(*) 



[r a (t)Y 



where 



(t) = afib^Trn a (t), h c (t) 



(t) 



(10) 



(11) 



7 C Q (*) 

are a characteristic shear stress and plastic modulus, a is a coefficient (on the 
order of 0.3), b is the Burgers vector, fi is the shear modulus, n (t) is the area 
density of forest dislocations intersecting the slip plane of system a, and the 
characteristic strain "f c is dependent upon the Burgers vector, the dislocation 
area density in system a, and the average distance between point obstacles. 
The off-diagonal (cross-hardening) terms h a/3 , a ^ (3 are taken to be zero. 
The effect of slip in one system on the hardening characteristics of another is 
presumed to be described by the forest dislocation density n a . Francosi and 
Zaoui |S] determined interaction parameters a Q/3 describing the dependence of 
the forest obstacles seen in system a on the dislocation density in system /3, (J 3 : 



(12) 



These parameters a Q/3 are available for both FCC and BCC crystals 0. 
For the case of quasi-static deformation of an FCC crystal, these dimensionless 
parameters are as follows: 
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a 


8 x 1CT 4 


ax/a 


5.7 


a 2 /a 


10.2 


a 3 /a 


16.6 



Table 1: FCC Interaction Coefficients 

2.1 Implicit formulation 

This constitutive framework was presented in the context of a fully implicit up- 
date scheme by Cuitiho and Ortiz. The general update procedure is to presume 
the unknowns of the deformations are the slip strain rates 7° , and to take as in- 
put the updated overall deformation gradient in the undeformed configuration, 
F n +i. This is accomplished by discretizing the viscosity law and solving it for 
the resolved shear stress r" +1 , 

< + i=V>(^p<£ + i) (13) 

where we note that <?" +1 is a function of A7, the vector of all Aj", through 
the hardening law Since we may write 

r a = (C e s a ) T Sm a (14) 
we find that we can express <|13fl as a function of the form 

f a (A 7 ) = r^ +1 - $ gf l+ ^j = (15) 

which may be solved using a Newton-Raphson iteration. This is possible 
because the Jacobian matrix of fT5|l may be computed explicitly 0, [T£] , This 
implicit approach converges rapidly, within two or three Newton-Raphson iter- 
ations. 

2.2 Explicit formulation 

While the implicit formulation exhibits good convergence properties, it (like 
most implicit integration schemes) is best suited to a larger increment in time. 
Explicit finite-element simulations (the sort to which we wish to apply our 
model) inherently use small time integration steps, limited by the time taken for 
a wave traveling at the Rayleigh speed to cross an element. Hence, an explicit 
formulation for the update is more appropriate for the application we intend. 

The constitutive framework for the explicit form of the model remains un- 
touched. Additionally, the unknowns of the crystal plasticity problem remain 
unchanged. That is, we design our constitutive update to take F n+1 as an input, 
solving for the unknown slip shear rates. Instead of solving iteratively for the 
ensemble of slip rates, we determine the shear rates in a sequential manner. The 
evaluation at step t n+ i is based on the hardening information and slip velocities 
from step t n . The explicit procedure used here can be summarized as follows: 
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1. Calculate g a , h aa for all systems based on step t n . 

2. Compute F e = F„ + iF p and evaluate r a for all systems. 

3. If t q < g a for all systems, go to step 6. Otherwise: 

4. Apply AF P = 7" At (s a ® m") due to the unused slip system a for which 

T a _ ga jg l ar gg S t. 

5. Premultiply F p by AF P , return to step 2 using this result. 

6. Compute new slip rates 7" +1 and hardening moduli h aa . 

Instead of iterating on the ensemble of slip systems, we now activate the 
system having the largest overstress and repeat until no unused systems exist 
for which the resolved shear stress exceeds the flow stress from the previous 
step. We are using the previous step as a predictor for the state during the 
next step. Logically, then, we expect the quality of the prediction to diminish 
as we attempt to take longer time steps At, which is true. As we will see in 
Section |3 the time required for an integration approaches that used for a fully 
elastic step. Depending on the slip activity during step t n , the iteration will 
diverge for values of At that are excessively large. Simulations on finite element 
meshes have shown that this value of At may be smaller than the maximum At 
allowed by the mesh. In other words, the maximal time step achieved by such a 
simulation would depend not upon the chosen mesh, but on the material model. 
We found this result to be unacceptable, and devised a solution described below. 

2.3 Subcycling formulation 

In order to work around the maximal time step limitation of the explicit model, 
we first need to understand why the model fails. The crystal plasticity model 
we have chosen requires that r a > g a for all active slip systems a. Further, 
since the explicit formulation uses the hardening data from step t n to predict 
the slip rates at t n +i, we may write 

r^ +1 =r(j n: At,FP n ) (16) 

That is, the resolved shear stress depends only on the slip increments, the 
time step, and the plastic deformation gradient at time t n - This dependence 
is manifested through the determination of AF P described in step 4 of the 
procedure above. The change in the flow stress on a system a is evaluated as 

9n+l = 9n + K a ln^ (17) 

which is linear in At. The relation (|16fl depends only on At, albeit non- 
linearly. Since we require r Q > g a , we can set Ijlfill equal to (J17JI and solve 
iteratively for At c , where we define At c as the maximum ("critical") time in- 
crement for which our constraint is satisfied. 
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In practice, we do not need the actual value of At c in order to proceed with 
the integration. We are merely interested in whether we need to invoke the sub- 
cycling algorithm. If the given At < At c , the evaluation may proceed without 
subcycling. If At > At c , subcycling is activated. It should be noted that the 
increments used for the deformation gradient within the subcycling procedure 
must be consistent with the multiplicative decomposition. The integration pro- 
ceeds by dividing the overall step At into smaller increments At = — < At c , 
where n is the number of subcycles. To summarize, the procedure we use is 
this: 

1. Test the condition r" > g a for all systems active at time t n+ \. If true, go 
to step 4. 

2. Evaluate the desired AF. Bisect the desired time step, test this step ap- 
plying AF inc = (AF) 1/2 . 

3. Repeat until condition in step 1 is satisfied. 

4. Exit, returning the number of subcycles n, the time step At = At/n, and 
the incremental deformation (AF) 1 '" . 

This procedure has the effect of adding more evaluations of the explicit step 
for each global step, since the convergence test itself is an explicit evaluation. 
However, we will see from the relative computational speed of the models that 
this small net loss of time is converted into a large gain. In general, not ev- 
ery step of a simulation requires the largest amount of subcycles to complete. 
Without subcycling, we would be constrained by the smallest At c required by 
any step. The subcycling implementation takes advantage of the larger At c 
values available in most steps. Additionally, for a large-scale simulation, not 
every quadrature point in the mesh requires subcycling at a given time step. 
It may well be, for a dynamic simulation, that part of the material is highly 
plastic while another is still in the elastic range. The subcycling implementa- 
tion allows the steps that are easier to evaluate to be finished quickly, without 
wasting many steps moving through the less computationally-intensive elastic 
region. In this way, the subcycling algorithm is analogous to an adaptive time 
step that activates only when necessary. 

Note that, in actuality, we never solve for the critical time step At c . Instead, 
we solve for the largest step At/2 n such that the condition in step 1 is satisfied. 
Our chosen method has two useful benefits. First, bisection is nearly trivial 
to implement and is computationally inexpensive (no derivatives to evaluate, 
for example). Second, while we do not arrive at the critical time step, we 
return a time step At < At c . Since At c is the maximum step we can take while 
maintaining convergence, the method we have chosen provides a built-in error 
tolerance to the time step evaluation. If we try to solve for At c exactly, we may 
find that small, unavoidable numerical errors in this solution would cause the 
integration to fail. The bisection solution helps reduce or eliminate this pitfall, 
resulting in a more robust crystal plasticity algorithm. 
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Figure 1: Variations in implicit model results with time step. 



3 Accuracy of the Algorithm 

We will evaluate our explicit model and subcycling implementation in two 
phases. First, we will compare the results from the implicit model to the ex- 
plicit, with the goal of proving that the computational speed we gain does not 
come at too steep of a price in accuracy. Then, we will apply the algorithm to 
some larger-scale simulations, highlighting the increase in processing speed seen 
in using the subcycling algorithm. 

An important consideration in comparing the results from the implicit and 
explicit algorithms is time-convergence (stationarity) of the results, especially 
for the implicit model. Most numerical integrations exhibit some form of time- 
step dependency; Figure ^ illustrates the dependency for several simulations 
of the quasi-static deformation of a copper crystal in tension along the [112] 
crystal axis. If we choose a time step At = 10, the model predicts a different 
response than At = 1 or At = 0.1. Note that the predicted responses for the 
latter two time steps differ by 0.008% of the smaller value. If we require our 
model responses to be stationary for smaller time steps, then we should use At 
of no larger than 1. Since we wish to apply our models to dynamic finite-element 
simulations using very small time steps, i.e., O (10 -8 ) , stationarity of the results 
with respect to the time step is a necessary condition. The results presented 
here match well with experimental data we list the model parameters used 
in Table H 



Figure is of the same form as Figure d but for the explicit model. The 
time- converged step is around At — 10 -4 , while the results at At = 10~ 3 are 
shown to be vastly different. 



8 



Elastic Constant Cn 


168.4 GPa 


Elastic Constant Cyi 


121.4 GPa 


Elastic Constant C44 


75.4 GPa 


9o 


2.0 MPa 


7o 


10 sec" 1 


m 


0.1 


a [sec Eq. llllllj 


0.3 


b 


2.56 x lO" 10 


Initial dislocation density, po 


10 12 m-' 2 


Saturation dislocation density, p sat 


10 15 m~ 2 


Saturation strain, j sa t 


0.5% 



Table 2: Definition of Symbols 
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Figure 2: Time-convergence behavior of the explicit model. 
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Our next step is to show that the explicit model with subcycling gives results 
similar to the explicit model without subcycling. In order to illustrate this part, 
we applied the implicit model and explicit model with and without subcycling to 
a single-integration-point simulation of a rolling test. The crystal is compressed 
at high strain rate (~ 5000/s) along its [001] axis, with the global X — Y 
axes at 45° to the crystal x — y axes. The global Y face is constrained. We 
choose this sort of test for two reasons: First, this test is closer to the type 
of simulation we wish to perform with the explicit model. Second, we do not 
wish for the constitutive tangents to cloud the results, since we intend to apply 
our subcycling algorithm to simulations not requiring the constitutive tangents. 
Figures 13161 show the time-convergence behavior for the implicit, explicit and 
subcycling models. Figure compares the results for a time-converged implicit 
and explicit model to results that include the use of subcycling (Ai~10 -8 ). The 
difference among the models at 15% reduction is 0.08% of the smallest value 
(in this case, the implicit). If we take a smaller time step with the subcycling 
model, the At c constraint is not violated and the subcycling model becomes 
the explicit model exactly. While we see oscillations in the force-deformation 
response when we apply subcycling, these oscillations are in general about the 
converged solution. As expected, the subcycling formulation continually over- 
and under-predicts the stress response, eventually settling on the correct value. 
We gain about an order of magnitude in the time step using subcycling. If we 
attempt to take a larger step than what is depicted, we find that the simulation 
no longer converges at a very early point in the deformation (within the first 50 
steps). 

A closer analysis of the output shows that the subcycling tests that failed had 
large oscillations in the stress-strain curve, as demonstrated by the simulation 
using At = 10~ 8 . This manner of failure is due to the issue of time-convergence. 
If the original input time step At is greater than the converged time step, then 
steps that involve subcycling will be on a time-converged solution curve, while 
any steps that do not involve subcycling may not be on the time-converged 
curve. The overall solution, then, runs on two paths; this condition eventually 
causes the overall solution to diverge. 

While these oscillations are a concern at the integration-point level, this 
effect tends to diminish within the framework of a large-scale finite-element 
solution. For such simulations, the subcycling implementation acts to make 
the system more robust. This is because of the necessarily smaller time steps 
used in the finite-element solution. These smaller time steps are more likely to 
be converged 1 . The difference between the subcycling path and the smoother 
path without subcycling will be less dramatic. Another important point from 
this analysis is that the subcycling algorithm used here does not increase the 
effective time step over which the model can progress without bound. There 
are two reasons for this. The first is numerical precision; that is, the AF~ 
calculations lead to matrices that approach the identity as n — * oo. For steps 

1 Certainly, the smaller time step is closer to the converged step than the single-integration- 
point steps above 
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of smaller than about 10~ 15 , the difference between 1 and (1 + e) is no longer 
computationally resolved. Hence, if we try to take steps smaller than this, the 
AF matrix will be seen by the machine as the identity. Secondly, our subcycling 
algorithm uses an estimate over the entire desired time step before taking the 
first step. This estimate will be good so long as the velocity in every slip system 
(j a ) decreases across the number of cycles. If we activate new systems in a 
subcycle, or if the velocity in a previously activated system exceeds the initial 
estimate, we are no longer guaranteed that the model will converge. It may 
be argued, then, that the current subcycling algorithm should be modified to 
be recursive, testing both the initial time step and that of each subcycle. The 
present formulation, however, has proven to be sufficient; the benefit gained 
by introducing a recursive subcycling algorithm may well be outweighed by 
the effort required in its implementation, both in terms of programming and 
computation time. 

We close our discussion of the constitutive-level results with comparisons of 
the computational time between the implicit and explicit integration. Figure 
|H1 presents the variation of computation time for one integration point versus 
deformation for the rolling-type test used above. (We omit the times for the 
subcycling model since the core of the integration is the explicit model.) We 
obtained these curves by running each update over the same set of parameters 
50 times and averaging the processor time used for the full set. These results 
show that the explicit calculation is about five to six times faster than the 
implicit. While by no means dramatic, this gain of time is significant; an implicit 
simulation lasting a week would (at the same strain step) take about a day for 
the explicit model to complete. Even for the constitutive-level analysis in this 
section, we can see the interplay between the largest stationary time step and 
the compute speed. The tensile tests (Figures ^ and |2J) at low strain rate allow 
the implicit model to take a step four orders of magnitude longer than the 
explicit, more than counteracting the half order of magnitude of compute speed 
gained by the explicit model. However, the rolling-type tests presented above 
require the implicit model to take a time step on the order of At = 10~ 10 and 
the explicit model 2 to take a time step on the order of At — 10~ 9 . In this case, 
the explicit model is simply faster, by an overall factor of about 50. 

4 Large- Scale Simulation Results 

The motivation for our foregoing discussion was the application of the de- 
rived material models to large-scale simulations requiring many integrations 
over small time steps. Here we provide a few examples of such tests performed 
using these algorithms. 

2 Without subcycling. 
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Figure 3: High-rate test using the implicit formulation. The results using At — 
1Q- 10 and At = 1CT 11 differ by 0.06% at 15% reduction. 
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Figure 4: Time stationarity test for compressive deformation using the explicit 
model. The two curves here differ by 0.0009% at 15% reduction. 
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Figure 5: Convergence test using subcycling. After the initial oscillations, the 
curves shown here are exactly atop one another. 




Figure 6: Showing the larger-scale oscillations introduced in the subcycling 
response for increased time steps. Larger steps than At = 3 x 10 -8 diverge. 
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Figure 7: Converged curves for implicit, explicit and subcycling models. 
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Figure 8: Comparison between compute times per second for the implicit and 
explicit update formulations. 
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4.1 Rolling Test of FCC Aluminum 

We begin with a 1 mm aluminum cube oriented such that the crystal z axis 
lies on the global Z axis, and the other two axes defined by a 45° rotation 
about the crystal z axis. This cube is subjected to a strain rate of ~ 5000/s 
along the global Z axis. The cube is constrained so that expansion in the global 
Y direction is not possible, but it may deform freely in all other directions. The 
results below show the force applied versus the net reduction, and the initial and 
final textures. The qualitative behavior of the force-deformation curve agrees 
with the physical interpretation of the simulation. The simulation captures the 
fluctuations in the applied force due to wave reflection from the opposite face 
of the cube, which are eventually damped out. To help make the case for our 
subcycling algorithm, we present results for the explicit model both with and 
without our subcycling algorithm. 

The effects of adding subcycling to the explicit model are striking. We will 
use an integration over a tetrahedral mesh of a cubic sample, modeling one 
grain in each direction (initial shape given in Figure |5J deformed in a rolling 
experiment as a sample case. Without the addition of subcycling, a simula- 
tion using the explicit integration model described above for an FCC material 
would require eight hours before the thickness reaches 50% reduction, using 
eight processors. This is not due to a limitation of the mesh, but rather of the 
material model; the material model is found to require a stable time step of 
about 25% that required by the tetrahedron geometry. If we try the identical 
simulation using the subcycling algorithm described above, we find that the 
maximum stable time step becomes that of the mesh - the material model is no 
longer the limiting factor. As a result, this same simulation on the same eight 
processors requires about fifteen minutes to reach 50% reduction. The gain of 
time is nonlinear because not all quadrature points require subcycling at any 
one time. It may be that only one of the approximately 380 integration points 
requires a smaller step at a given time. Without a subcycling algorithm, that 
one integration point would be enough to require the time step for the entire 
mesh to be reduced by a factor of four or more. 

Figure 1101 compares the force-reduction curves for three simulations. The 
difference among these simulations is the maximum allowed time step. The 
time_factor variable in the plot legend is a premultiplying factor applied to 
the mesh stable time step. The simulation using time_factor 0.1 required no 
subcycling to complete. At first glance, it seems that the simulation without 
subcycling is capturing more of the reflected waves than the one using subcycling 
at the beginning of the deformation. However, if we look at a close-up of this 
region, we find that the simulations cross the same points where they exist; 
Figure 1111 That is, we are capturing the same curve with varying levels of 
detail. This implies that the additional oscillations seen for time_factor=0.1 
are a result of having a smaller time step, and not a failing of the subcycling 
implementation. 

We also attempted a simulation of this deformation using our implicit formu- 
lation. The force-deformation results are shown in Figure IT51 and IT3l compared 
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Figure 9: Initial cube for the tests that follow. This cube is meshed with 384 
tetrahedra. The heavy lines denote boundaries between twenty-sided polyhedral 
grains. 



to the explicit formulation at time_factor of 0.5. (Note that the implicit model 
was capable of stepping as large as time_factor 0.75, but the results at this time 
step are not stationary as defined above) . We plot the time taken by each model 
versus the time step in Figure IT?! Interestingly, we start to see a negative return 
when we increase the time step past 75% of the mesh time. This is due to the 
larger number of subcycles required for the full mesh time step to proceed. The 
number of extra evaluations required outweighs the 33% larger time step for 
this case. In fact, the simulation with time_factor=1.0 required almost as much 
time to complete as that with time_factor=0.5. Thus, blindly increasing the 
time step up to the value allowed by the mesh does not necessarily produce the 
most efficient simulation. Note also that the implicit model is slower at every 
time step by about a factor of 5; this coincides nicely with the predictions of 
Figure 03 

Another way in which we can check our subcycling implementation is by 
examining the predicted material texture at the end of the deformation. Each 
of our rolling tests was given the initial texture in Figure ITTI Figure ITSI shows 
that we recover similar <lll>textures for all the explicit simulations, given 
the same initial textures. Also, the implicit model predicts similar textures to 
the explicit for the same time step; see Figure .These textures agree well with 
measured textures from the literature. In order to produce these textures, we 
reduced the interaction parameter ao to 2 x 10 -5 (see Table^). FigurelTBIshows 
the final textures obtained from implicit model simulations, using time steps at 
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50% and 10% of the mesh value. These textures are similar to those predicted 
by the explicit model. 

Note that the predicted forces in the foregoing figures are quite small, on the 
order of 10 N. Since this force is applied to a 1 mm cube, this translates directly 
to 10 MPa. The low force values returned are due to our rather small choice of 
viscosity exponent (m = 0.1). This exponent gives very good force matching for 
the quasi-static cases above, but it does not reflect the physics of the dynamic 
simulations very well. This viscosity exponent was chosen for these simulations 
to test the limitations of our explicit model. High-viscosity deformations will 
have lower rates, thus less plasticity and hardening, and therefore will easily 
converge for larger times (i.e., At c will increase with the viscosity exponent). 
By taking on the more challenging rate- independent limit, we can be sure our 
model will perform well in the somewhat easier high-viscosity simulations. To 
demonstrate this, we present Figure El which shows results for time_factors of 
1.0 and 0.75 with the viscosity exponent changed to 1.0. The forces increase 
significantly from what we saw in the rate-independent limit; now, our forces 
imply stresses in the range of several GPa. Interestingly, the simulation using 
the full time step was 25% (6 minutes) faster than the simulation with 75% of 
the mesh step for this set of parameters. Since the larger viscosity exponent 
leads to less plasticity and therefore less hardening, it follows that the At c val- 
ues for these simulations are longer, reducing the number of subcycles needed 
to converge. Figure IT§1 shows the texture for the case with time_factor=1.0; 
our agreement with measured values degrades. In fact, we detect no texture 
evolution whatsoever, since the amount of plastic deformation has reduced con- 
siderably due to the increased viscosity exponent. This question, unfortunately 
cannot be resolved at the present time. The published texture data is measured 
after the completed deformation, after a period of time long enough to allow 
the deformed sample to relax. Presumably, the texture evolution continues after 
the test is completed. A more accurate comparison would be drawn between 
the current results and in situ texture evolution data. The appropriate set of 
model parameters may then be determined. 

4.2 Polycrystal tests 

We have also performed some simulations of polycrystals under similar condi- 
tions. We have used both Taylor averaging and direct numerical simulation 
(DNS) to model the polycrystalline behavior based on the single-crystal model. 
A discussion of the merits and drawbacks of these approaches has been discussed 
elsewhere (see Zhao, et al |S3)- We provide some of the results of these simula- 
tions here as they pertain to the explicit constitutive update scheme, with and 
without subcycling. 

We noted above the large fluctuations in force-deformation response for a 
single integration point using the subcycling implementation. These oscillations 
had little effect on the overall results for single crystals; presumably their effects 
were damped out by the overall bulk of the available quadrature points. How- 
ever, we find that these fluctuations are important for modeling of polycrystals. 



17 



-10 - 

"llllllllllllllllllllll.il 

0.1 2 3 0.4 5 

Reduction, mm 



Figure 10: Comparison of force-deformation curves for explicit model runs with 
several different time steps. 
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Figure 11: Close-up of the foregoing figure, with symbols designating points 
returned. 
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Figure 12: Force response predictions for the explicit and implicit model, both 
using 50% of the maximal mesh time step. 
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Figure 13: Close-up of the implicit and explicit force curves. Note that the 
symbols lie atop one another. 
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Figure 14: Initial <111> texture for the tests in this section. 



Specifically, we tested a sample using the same mesh as the above, but applied 
the deformation to a polycrystal having 91 orientations using a Taylor averaging 
approach; that is, we presume the overall deformation gradient and deformation 
rate are the same for all orientations, and average to find the resulting stress. 
Where the subcycling approach allowed our simulations to use the 100% of the 
time step allowable by the mesh for a single crystal (i.e., time_factor=0.75), we 
were forced to reduce to a time step at 25% of that required by the mesh for 
this polycrystal. Simulations applying DNS show similar behavior. We feel the 
difficulty is a product of the interfaces of the polycrystalline grains. Instead of 
having only a gradient between the part of the crystal that is heavily deformed 
and the portion that is still nearly elastic (as in the single-crystal case), the 
polycrystal introduces gradients between parts of the crystal that have simi- 
lar overall deformation gradients and yet different plastic behavior. It stands to 
reason that neighboring points in a polycrystalline mesh may well require vastly 
different numbers of subcycles to meet the At c constraint. At points such as 
this, one element will then be on the more stable converged path, while another 
is on the somewhat volatile subcycling path. The result is that subcycling loses 
some of its effectiveness for polycrystal modeling. 

Subcycling still benefits the modeling effort, however. Instead of being forced 
to a small step of 10% the overall mesh step, we can still increase our time step 
by a factor of 2.5. The results for averaging 91 orientations took about four 
days, which is a linear scaling from the time taken by a single orientation at 
the same time step. The results are shown below; the force results in Figure QUI 
and the final texture in Figure |2"T1 
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(c) 0.50. (d) 0.10. 



Figure 15: Final textures for the explicit model simulations for different time 
factors. 
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Figure 16: Textures predicted by the implicit model simulations for different 
time factors. 
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Figure 17: Comparison of running time versus time step for both implicit and 
explicit models. 
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Figure 18: Force-deformation curves for two simulations using viscosity expo- 
nent m — 1.0. 




Figure 19: Texture prediction for the high-viscosity simulation. 
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Figure 20: Predicted force-deformation curve for the polycrystal sample. 
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Figure 21: Predicted texture for Taylor averaging simulation using 91 orienta- 
tions per integration point. 
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5 Conclusions 



We have demonstrated two model improvements which enhance the compu- 
tational speed of an implicit constitutive update, in cases dominated by the 
global time step. Beginning with an implicit formulation based on the model 
of Cuitino and Ortiz, we modified the update to an explicit formulation. This 
explicit formulation was shown to be five to six times faster than the original 
implicit algorithm per update step at the constitutive level, without loss of ac- 
curacy. The explicit update is limited by the size of time step that can be taken. 
We were able to increase this maximal time step by an order of magnitude for 
single crystals by introducing a subcycling algorithm to the explicit form. All 
three models produce the same stress-strain behavior at the integration point 
level for the same input parameters. 

We then applied our three update formulations to large-scale finite-clement 
calculations. The explicit update with subcycling was shown to be able to in- 
tegrate a larger time step for single crystals than even the implicit, since the 
subcycling procedure changes the larger input time step into several smaller 
steps that can be evaluated by our explicit algorithm. For polycrystals, the 
subcycling algorithm does not allow simulations to run at the full mesh time 
step. However, the subcycling implementation allows an increase in the maxi- 
mum time step over the explicit model without subcycling, and so results in an 
improvement in computational speed. 

Notably, the implicit and explicit models each have their uses. The implicit 
model is best suited to quasi-static simulations allowing larger time steps to 
be used 3 . In this sort of simulation, the gain of time per step realized by the 
explicit model is countered by the number of subcycles necessary to use the 
explicit model. Dynamic simulations, for which the time step is typically very 
short, are well-suited to the explicit model's capabilities. For such simulations, 
the explicit model proves a robust algorithm. 
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